[Security] Implement Security Option for Handling Tool-Call Chains#4691
[Security] Implement Security Option for Handling Tool-Call Chains#4691shellz-n-stuff wants to merge 5 commits intoblock:mainfrom
Conversation
DOsinga
left a comment
There was a problem hiding this comment.
overall looks great, there's a lot of debugging logs in there that presumably we don't need anymore, can you clear those up? same for the LLM style comments
| use serde_json::Value; | ||
|
|
||
| /// Result of a security scan. | ||
| #[derive(Debug, Clone)] |
crates/goose/src/security/scanner.rs
Outdated
| } | ||
|
|
||
| /// Get threshold from config | ||
| /// Get the maliciousness threshold from config, or use default. |
There was a problem hiding this comment.
if you're not happy with the current comment, rename the function (drop from_config, I don't think the caller cares where we store this)
crates/goose/src/security/scanner.rs
Outdated
| } | ||
|
|
||
| /// Scan with prompt injection model (legacy method name for compatibility) | ||
| /// Scan text for prompt injection (legacy compatibility). |
There was a problem hiding this comment.
ugh, we should have caught this the first place. remove this comment - legacy
| disabled_secondary_tool_list: &[String], | ||
| ) -> bool { | ||
| let tool_name = tool_call.name.as_str(); | ||
| if !disabled_secondary_tool_list.iter().any(|t| t == tool_name) { |
There was a problem hiding this comment.
I'd consider having the caller check this; that would leave this function with just one job, makes naming easier too
There was a problem hiding this comment.
I was thinking of just making it fully self-contained (IE it "just works").
I may split this into into a function though to make it cleaner/improve readability
There was a problem hiding this comment.
Actually looking at it, we re-use the tool_name var quite a lot. So I think it's really a call on if the caller should do it vs the function.
I personally like the function having the check but happy to change if you feel strongly
There was a problem hiding this comment.
you should keep the toolname of course. I can go either way on it, but the function is called is_secondary_tool_violation_single, not is_secondary_tool_violation_single_if_enabled
| } | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
you're going through the message twice here, I think you can do it in one go like:
for msg in messages.iter().rev():
if isUsere(msg): break
if toolCallName(msg) != toolCallName: return true
return true
There was a problem hiding this comment.
also, we have effective_role which will return tool for tool messages
|
config.yaml makes sense as we already have it for rules for commands, but in future I wonder if people will want to build a distro with these more role-based (ie protect people from themselves) but I digress... looking good so far! |
|
LGTM - but will want to check if this will be a problem with config.yaml: change here #4651 - does this imply the config section in config.yaml is changing cc @dorien-koelemeijer ? (ie no security: section?) |
Yeah, perhaps a comment on this whether we could also configure all of this through the UI settings and keep the config as all other config (with underscores)? See #4651 @shellz-n-stuff |
|
@dorien-koelemeijer, I can make the updates on this branch after your PR gets merged? Not in a mega rush here! |
Just wondering whether you wanted me to update/rename any config items or not? |
| /// Get threshold from config | ||
| pub fn get_threshold_from_config(&self) -> f32 { | ||
| /// Get the confidence threshold from config, or use default. | ||
| pub fn get_confidence_threshold_from_config(&self) -> f32 { |
There was a problem hiding this comment.
I thought we'd just call this get_confidence_threshold and drop the comment?
|
Just adding a comment for visibility instead of chatting over Slack - I've now updated the config for prompt injection to be as follows: |
|
do we still want this to land? |
|
I'm going to close this for now due to staleness. can always reopen when relevant again |
Pull Request Description
One of the key attacks against Goose is MCP poisoning. Using the response data from an MCP to trigger a non-expected tool-call that performs an unintended action.
Whilst we absolutely don't want to remove the utility of the agent autonomously deciding to call a chain of tools, users should be able to configure security profiles in their config to prevent certain tools being called as a "secondary tool" without user input.
This PR introduces such functionality
Implementation Detail
The general approach is to intercept tool-calls and perform a message look back to see what tools have been called since prior invocation. Specifically, the method is as follows:
Testing